Hedging Bets in Markov Decision Processes
Authors
Abstract
The classical model of Markov decision processes with costs or rewards, while widely used to formalize optimal decision making, cannot capture scenarios where the agent pursues multiple objectives during the system evolution, but only one of these objectives gets actualized upon termination. We introduce the model of Markov decision processes with alternative objectives (MDPAO) for formalizing optimization in such scenarios. To compute the strategy that optimizes the expected cost/reward upon termination, we need to determine how to balance the values of the alternative objectives. This requires analysis of the underlying infinite-state process that tracks the accumulated values of all the objectives. While the decidability of computing the exact optimal strategy for the general model remains open, we present the following results. First, for a Markov chain with alternative objectives, the optimal expected cost/reward can be computed in polynomial time. Second, for a single-state process with two actions and multiple objectives, we show how to compute the optimal decision strategy. Third, for a process with only two alternative objectives, we present a reduction to the minimum expected accumulated reward problem for one-counter MDPs, which leads to decidability for this case under some technical restrictions. Finally, we show that the optimal cost/reward can be approximated up to a constant additive factor for the general problem.

1998 ACM Subject Classification: G.3 Probability and Statistics
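For the Markov chain case, the polynomial-time result rests on a standard fact: expected accumulated cost until absorption satisfies a linear system. A minimal sketch (a textbook computation, not the paper's MDPAO algorithm; the chain, costs, and state labels below are invented for illustration):

```python
import numpy as np

# Toy absorbing Markov chain: states 0 and 1 are transient, state 2 absorbs.
# Q is the transient-to-transient block of the transition matrix.
Q = np.array([[0.5, 0.3],
              [0.2, 0.4]])
c = np.array([1.0, 2.0])  # per-step cost incurred in each transient state

# Expected accumulated cost v satisfies v = c + Q v, i.e. (I - Q) v = c,
# solvable in polynomial time by Gaussian elimination.
v = np.linalg.solve(np.eye(2) - Q, c)
print(v)  # expected total cost starting from states 0 and 1
```

Tracking several alternative objectives, as the MDPAO model requires, is harder precisely because the accumulated values form an unbounded (infinite-state) component on top of such a chain.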
Similar Papers
Rational Bar Bets
Various "no trade" theorems suggest that rational agents who start from a common prior should not make speculative bets with each other. To support speculation, market microstructure models typically invoke agents with dynamic hedging demands. Thus a "bar bet", a simple bet on a risk-irrelevant topic negotiated in an informal but non-private social context, seems irrational. We might, however, ...
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...
Yeast Survive by Hedging Their Bets
A new experimental approach reveals a bet hedging strategy in unstressed, clonal yeast cells, whereby they adopt a range of growth states that correlate with expression of a trehalose-synthesis regulator and predict resistance to future stress.
15 Markov Decision Processes in Finance and Dynamic Options
In this paper a discrete-time Markovian model for a financial market is chosen. The fundamental theorem of asset pricing relates the existence of a martingale measure to the no-arbitrage condition. It is explained how to prove the theorem by stochastic dynamic programming via portfolio optimization. The approach singles out certain martingale measures with additional interesting properties. Furth...
Markov Switching GARCH models for Bayesian Hedging on Energy Futures Markets
A new Bayesian multi-chain Markov Switching GARCH model for dynamic hedging in energy futures markets is developed: a system of simultaneous equations for return dynamics on the hedged portfolio and futures is introduced. More specifically, both the mean and variance of the hedged portfolio are assumed to be governed by two unobserved discrete state processes, while the futures dynamics is drive...
Publication date: 2016